Errors (versions) by matt-aitken · Pull Request #3187 · triggerdotdev/trigger.dev

matt-aitken · 2026-03-05T23:24:10Z

Added versions filtering on the Errors list and page
Added errors stacked bars to the graph on the individual error page

changeset-bot · 2026-03-05T23:24:17Z

⚠️ No Changeset found

Latest commit: e609e8b

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

coderabbitai · 2026-03-05T23:27:03Z

Walkthrough

Adds version-aware error grouping and alerting across frontend, backend, and DB. New UI: LogsVersionFilter and exported VersionsDropdown, status filters, chart legend styling, and list/status badges. Backend presenters (ErrorGroupPresenter, ErrorsListPresenter) now accept and return versions and state; introduces ErrorGroupState, ErrorGroupActions service, deliver/evaluate alert services (DeliverErrorGroupAlertService, ErrorAlertEvaluator), alerts worker tasks, and email template for error alerts. ClickHouse query builders for per-version aggregations and "since" queries were added. Prisma schema and SQL migrations create ErrorGroupState and add project alert config fields.

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~150 minutes

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (2 warnings)

Check name	Status	Explanation	Resolution
Description check	⚠️ Warning	The description is incomplete: missing issue reference, incomplete checklist, no testing details provided, and lacks structured sections as required by the template.	Add issue reference (Closes #), complete the checklist, describe testing steps, and include changelog details in the provided template format.
Docstring Coverage	⚠️ Warning	Docstring coverage is 19.23% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (1 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'Errors (versions)' directly relates to the main objective of adding version filtering to the errors list and page, and is concise and clear.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings (stacked PR)
📝 Generate docstrings (commit on current branch)

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch errors-add-version-filter

📝 Coding Plan

Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

Tip

CodeRabbit can use TruffleHog to scan for secrets in your code with verification capabilities.

Add a TruffleHog config file (e.g. trufflehog-config.yml, trufflehog.yml) to your project to customize detectors and scanning behavior. The tool runs only when a config file is present.

devin-ai-integration

Devin Review found 2 potential issues.

View 4 additional findings in Devin Review.

devin-ai-integration · 2026-03-05T23:27:34Z

...tes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.errors._index/route.tsx

🚩 Errors list sparklines don't reflect the version filter

When a user filters errors by version on the list page (errors._index/route.tsx), the occurrencesPromise at line 120-134 chains from the filtered listPromise but calls presenter.getOccurrences() without passing the versions filter. This means the sparkline activity graphs shown in each error row will display occurrences across ALL versions, even when the list itself is filtered to specific versions. This may be intentional (showing full activity context) but could confuse users who expect the sparklines to match the filtered view.

(Refers to lines 120-134)

Was this helpful? React with 👍 or 👎 to provide feedback.

devin-ai-integration · 2026-03-05T23:27:35Z

apps/webapp/app/presenters/v3/ErrorGroupPresenter.server.ts

+    const data = buckets.map((epoch) => {
+      const point: Record<string, number | Date> = { date: new Date(epoch * 1000) };
+      for (const version of sortedVersions) {
+        point[version] = byBucketVersion.get(`${epoch}:${version}`) ?? 0;
+      }
+      return point;
+    });


🟡 Version string used as Record key can collide with reserved date key, breaking the chart

In getOccurrences, version strings are used as dynamic keys in the same Record<string, number | Date> that already has a date key (apps/webapp/app/presenters/v3/ErrorGroupPresenter.server.ts:214-217). If a task_version value is literally "date" (or empty-string mapped to "unknown" which is safe, but "date" is not guarded), the version count (a number) overwrites the date field (a Date). Downstream in ActivityChart (apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.errors.$fingerprint/route.tsx:428), d.date instanceof Date would be false and new Date(smallNumber).getTime() would produce a nonsensical timestamp (epoch 1970), completely breaking the chart's x-axis. The same collision can occur with "__timestamp" added at route.tsx:427.

Prompt for agents

In apps/webapp/app/presenters/v3/ErrorGroupPresenter.server.ts, the getOccurrences method (lines 213-219) uses version strings as keys in a Record that also has a "date" key. To prevent collisions, either: 1. Prefix version keys with a safe namespace (e.g., "v:" prefix) and update the chart consumer to strip the prefix when displaying labels, OR 2. Restructure the data to use a separate nested object for version counts, e.g.: { date: Date, versions: Record<string, number> } and update the ChartRoot consumer in the fingerprint route (route.tsx lines 475-496) to match. Also in apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.errors.$fingerprint/route.tsx line 427, the __timestamp key added via spread could similarly collide if a version were named "__timestamp". Guard against both reserved keys ("date" and "__timestamp").

Was this helpful? React with 👍 or 👎 to provide feedback.

coderabbitai

Actionable comments posted: 8

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.alerts/route.tsx (1)
560-574: ⚠️ Potential issue | 🔴 Critical

Missing ERROR_GROUP case in alertTypeTitle will cause runtime error.

The ProjectAlertType enum now includes ERROR_GROUP (added in this PR's schema changes), but alertTypeTitle doesn't handle it. When an alert channel with ERROR_GROUP type is displayed in the table (line 237), this will throw an error.
🐛 Proposed fix
 export function alertTypeTitle(alertType: ProjectAlertType): string {
   switch (alertType) {
     case "TASK_RUN":
       return "Task run failure";
     case "TASK_RUN_ATTEMPT":
       return "Task attempt failure";
     case "DEPLOYMENT_FAILURE":
       return "Deployment failure";
     case "DEPLOYMENT_SUCCESS":
       return "Deployment success";
+    case "ERROR_GROUP":
+      return "Error group";
     default: {
       throw new Error(`Unknown alertType: ${alertType}`);
     }
   }
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@apps/webapp/app/routes/_app.orgs`.$organizationSlug.projects.$projectParam.env.$envParam.alerts/route.tsx
around lines 560 - 574, The switch in alertTypeTitle(ProjectAlertType) is
missing a handler for the new "ERROR_GROUP" enum value and will throw for that
case; update the alertTypeTitle function to add a case for "ERROR_GROUP" (e.g.,
case "ERROR_GROUP": return "Error group";) so all ProjectAlertType variants are
handled and the default throw is avoided when ERROR_GROUP alerts are rendered
(used elsewhere where alerts are displayed).
apps/webapp/app/presenters/v3/ErrorGroupPresenter.server.ts (1)
64-85: ⚠️ Potential issue | 🔴 Critical

getSummary() declares ErrorGroupSummary as return type but returns an incomplete object without state.

The getSummary() method at line 249 declares a return type of Promise<ErrorGroupSummary | undefined>, where ErrorGroupSummary.state is required. However, the method returns an object (lines 276–285) that omits the state field entirely. While the call() method does assign state after retrieval (line 149), this violates the type contract of getSummary(). Either make state optional in the type definition, include state in getSummary()'s return object, or update getSummary()'s return type to reflect the incomplete object it actually returns.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/webapp/app/presenters/v3/ErrorGroupPresenter.server.ts` around lines 64
- 85, getSummary() claims to return ErrorGroupSummary but omits the required
state field; update getSummary() to include a state property in its returned
object (either populated from the same data source used by call() or a sensible
default with fields like status, resolvedAt, resolvedInVersion, resolvedBy,
ignoredUntil, ignoredReason, ignoredByUserId, ignoredUntilOccurrenceRate,
ignoredUntilTotalOccurrences) so the returned object matches the
ErrorGroupSummary type, and remove any subsequent reassignment in call() that
assumes state is missing.

🧹 Nitpick comments (4)

internal-packages/clickhouse/src/index.ts (1)
248-263: Add @crumbs tracing annotations for the new error-query paths.

Please add crumb comments around the newly added errors query builder exposure to align with repository tracing standards.

As per coding guidelines, **/*.{ts,tsx,js}: “Add crumbs as you write code using // @crumbs comments or // #region @crumbs blocks for agentcrumbs debug tracing”.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal-packages/clickhouse/src/index.ts` around lines 248 - 263, The new
errors getter (get errors) that exposes multiple query-builder helpers
(getGroups, getInstances, getHourlyOccurrences, affectedVersionsQueryBuilder,
listQueryBuilder, occurrencesListQueryBuilder, createOccurrencesQueryBuilder,
createOccurrencesByVersionQueryBuilder, occurrenceCountSinceQueryBuilder,
activeErrorsSinceQueryBuilder, occurrenceCountsSinceQueryBuilder) needs
repository-standard tracing crumbs; wrap the errors block with // `@crumbs`
comments (or a // `#region` `@crumbs` ... // `#endregion` `@crumbs` block) immediately
before and after the getter so agentcrumbs can trace calls to these functions,
ensuring the comments surround the get errors { ... } block and reference the
same function names for clarity.
apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.errors.$fingerprint/route.tsx (1)
630-632: Consider a more reliable approach for closing the dialog after submission.

Using setTimeout(onClose, 100) is fragile—if the submission takes longer than 100ms to start, the dialog may close before the request begins. Consider using the fetcher's state to close the dialog when submission completes:
♻️ Suggested alternative
// In CustomIgnoreForm, use useEffect to close on success:
import { useEffect } from "react";

// ...

useEffect(() => {
  if (fetcher.state === "idle" && fetcher.data?.ok) {
    onClose();
  }
}, [fetcher.state, fetcher.data, onClose]);

// Remove setTimeout from onSubmit
Or simply close immediately on submit and let the optimistic UI handle it if that's the intended behavior.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@apps/webapp/app/routes/_app.orgs`.$organizationSlug.projects.$projectParam.env.$envParam.errors.$fingerprint/route.tsx
around lines 630 - 632, The current onSubmit handler uses setTimeout(onClose,
100) which is fragile; remove the setTimeout call in the onSubmit prop of
CustomIgnoreForm and instead close the dialog when the fetcher completes by
adding a useEffect in the CustomIgnoreForm component that watches fetcher.state
and fetcher.data (e.g., useEffect(() => { if (fetcher.state === "idle" &&
fetcher.data?.ok) onClose(); }, [fetcher.state, fetcher.data, onClose]));
alternatively, if you intend optimistic UI, call onClose() immediately in
onSubmit and omit the timeout.
apps/webapp/app/v3/services/alerts/deliverErrorGroupAlert.server.ts (1)
30-46: Prefer a type alias for the payload shape.

This new model should use type instead of interface to stay aligned with the repo's TypeScript rules.

As per coding guidelines, **/*.{ts,tsx}: Use types over interfaces for TypeScript.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/webapp/app/v3/services/alerts/deliverErrorGroupAlert.server.ts` around
lines 30 - 46, Replace the interface declaration ErrorAlertPayload with a type
alias using the same shape (e.g., type ErrorAlertPayload = { ... }); keep the
property names and nested error object unchanged (including channelId,
projectId, classification, and the error fields like fingerprint, environmentId,
environmentName, taskIdentifier, errorType, errorMessage, sampleStackTrace,
firstSeen, lastSeen, occurrenceCount) so all references to ErrorAlertPayload and
ErrorAlertClassification continue to work; update any imports/exports if
necessary to reflect the type alias.
apps/webapp/app/v3/services/alerts/errorAlertEvaluator.server.ts (1)
16-26: Prefer type aliases for these internal models.

AlertableError and ResolvedEnvironment should use type aliases here to match the repo's TypeScript convention.

As per coding guidelines, **/*.{ts,tsx}: Use types over interfaces for TypeScript.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/webapp/app/v3/services/alerts/errorAlertEvaluator.server.ts` around
lines 16 - 26, Replace the two internal interfaces with type aliases to match
repo convention: convert interface AlertableError and interface
ResolvedEnvironment into type AlertableError = { ... } and type
ResolvedEnvironment = { ... }, keeping the exact same property names and types
(classification, error, environmentName) and (id, type, displayName)
respectively; update any local references if needed but do not change the shape
or visibility.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/webapp/app/presenters/v3/ErrorGroupPresenter.server.ts`:
- Around line 130-145: getState is being called without the task identifier so
it can return the wrong ErrorGroupState when the same fingerprint exists under
multiple tasks; stop fetching summary and state in parallel and instead first
await this.getSummary(...) to read the taskIdentifier, then call
this.getState(environmentId, taskIdentifier, fingerprint) (or otherwise pass the
task identifier into getState) when you run the remaining Promise.all for
getAffectedVersions, getRunList and getState so state lookup is disambiguated by
taskIdentifier.

In `@apps/webapp/app/v3/services/alerts/createAlertChannel.server.ts`:
- Around line 104-112: The helper `#scheduleErrorAlertEvaluation` currently
enqueues a job using the same id pattern "evaluateErrorAlerts:${projectId}"
that's used by the evaluator's delayed self-chain, causing deduplication/no-op
if a future run is already queued; change the id for this immediate kick to be
unique (for example append a timestamp or a "now" suffix like
"evaluateErrorAlerts:${projectId}:now:${Date.now()}") so alertsWorker.enqueue
always creates a distinct job when triggered immediately after channel
creation/edit; keep the job name "v3.evaluateErrorAlerts" and payload identical,
only alter the id generation logic in `#scheduleErrorAlertEvaluation`.

In `@apps/webapp/app/v3/services/alerts/deliverErrorGroupAlert.server.ts`:
- Around line 239-243: The code logs and throws the raw webhook URL
(webhookProperties.data.url) from the channel object (type
ProjectAlertChannelType), which can leak embedded credentials; update
deliverErrorGroupAlert.server.ts to sanitize/redact webhookProperties.data.url
before any logging or including it in thrown errors (e.g., replace
host/path/query credentials with "[REDACTED]" or strip auth/query params) and
ensure any thrown Error messages or processLogger entries use the redacted
value; apply the same change for the other occurrences around the webhook
handling block (also referenced at the second occurrence near lines 310-316) so
no raw webhook URL is emitted.

In `@apps/webapp/app/v3/services/alerts/errorAlertEvaluator.server.ts`:
- Around line 178-197: The total-occurrence unignore check uses
context.occurrencesSince (window-only) so a threshold like
ignoredUntilTotalOccurrences won’t fire across multiple evaluation windows;
change the logic to compute the cumulative occurrences since the alert was
ignored and compare that to state.ignoredUntilTotalOccurrences. Specifically,
replace the context.occurrencesSince check with a call that gets counts since
state.ignoredAt (or otherwise compute occurrences since state.ignoredAt using
getOccurrenceCountsSince/your occurrence-count helper) and use that cumulative
count in the if that references state.ignoredUntilTotalOccurrences and
state.ignoredAt.
- Around line 102-125: The loop enqueues alerts to every envChannels entry on
every evaluator run, ignoring per-channel intervals; inside the for (const
channel of envChannels) loop (before calling alertsWorker.enqueue) add a guard
that respects the channel's configured interval by comparing it to the
evaluator's current run interval (e.g., currentRunIntervalMinutes or
evaluationIntervalMinutes): only enqueue when the current run aligns with the
channel interval (for example channel.intervalMinutes ===
currentRunIntervalMinutes or channel.intervalMinutes % currentRunIntervalMinutes
=== 0), otherwise continue; update any related loops mentioned (the other
occurrences around the same pattern at the referenced ranges) to use the same
interval-check guard.
- Around line 30-35: The constructor for ErrorAlertEvaluator has its _prisma and
_replica defaults reversed causing writes (e.g., updateErrorGroupStates) to go
to the read-only client; swap the defaults so _prisma defaults to prisma and
_replica defaults to $replica. Update the constructor parameter
order/assignments for ErrorAlertEvaluator accordingly so any write operations
use the primary Prisma client (_prisma = prisma) and reads use the replica
(_replica = $replica).
- Around line 372-387: Remove the stray comma-expression "await this, ..." so
the Prisma update is actually awaited: locate the call to
this._prisma.errorGroupState.update (the block that sets status: "UNRESOLVED"
and clears ignored/resolved fields) and change the statement to await the update
promise directly (i.e., await this._prisma.errorGroupState.update(...)) so the
database write completes before continuing (e.g., before calling selfChain()).

In `@internal-packages/emails/emails/alert-error-group.tsx`:
- Line 58: Change the default export of the component function Email to a named
export (use "export function Email(...)" instead of "export default function
Email(...)") and then update the corresponding import in the emails index module
to use a named import (e.g. import { Email } from '...') or rename on import if
needed; also update any usages that relied on the default import to reference
the named export.

---

Outside diff comments:
In `@apps/webapp/app/presenters/v3/ErrorGroupPresenter.server.ts`:
- Around line 64-85: getSummary() claims to return ErrorGroupSummary but omits
the required state field; update getSummary() to include a state property in its
returned object (either populated from the same data source used by call() or a
sensible default with fields like status, resolvedAt, resolvedInVersion,
resolvedBy, ignoredUntil, ignoredReason, ignoredByUserId,
ignoredUntilOccurrenceRate, ignoredUntilTotalOccurrences) so the returned object
matches the ErrorGroupSummary type, and remove any subsequent reassignment in
call() that assumes state is missing.

In
`@apps/webapp/app/routes/_app.orgs`.$organizationSlug.projects.$projectParam.env.$envParam.alerts/route.tsx:
- Around line 560-574: The switch in alertTypeTitle(ProjectAlertType) is missing
a handler for the new "ERROR_GROUP" enum value and will throw for that case;
update the alertTypeTitle function to add a case for "ERROR_GROUP" (e.g., case
"ERROR_GROUP": return "Error group";) so all ProjectAlertType variants are
handled and the default throw is avoided when ERROR_GROUP alerts are rendered
(used elsewhere where alerts are displayed).

---

Nitpick comments:
In
`@apps/webapp/app/routes/_app.orgs`.$organizationSlug.projects.$projectParam.env.$envParam.errors.$fingerprint/route.tsx:
- Around line 630-632: The current onSubmit handler uses setTimeout(onClose,
100) which is fragile; remove the setTimeout call in the onSubmit prop of
CustomIgnoreForm and instead close the dialog when the fetcher completes by
adding a useEffect in the CustomIgnoreForm component that watches fetcher.state
and fetcher.data (e.g., useEffect(() => { if (fetcher.state === "idle" &&
fetcher.data?.ok) onClose(); }, [fetcher.state, fetcher.data, onClose]));
alternatively, if you intend optimistic UI, call onClose() immediately in
onSubmit and omit the timeout.

In `@apps/webapp/app/v3/services/alerts/deliverErrorGroupAlert.server.ts`:
- Around line 30-46: Replace the interface declaration ErrorAlertPayload with a
type alias using the same shape (e.g., type ErrorAlertPayload = { ... }); keep
the property names and nested error object unchanged (including channelId,
projectId, classification, and the error fields like fingerprint, environmentId,
environmentName, taskIdentifier, errorType, errorMessage, sampleStackTrace,
firstSeen, lastSeen, occurrenceCount) so all references to ErrorAlertPayload and
ErrorAlertClassification continue to work; update any imports/exports if
necessary to reflect the type alias.

In `@apps/webapp/app/v3/services/alerts/errorAlertEvaluator.server.ts`:
- Around line 16-26: Replace the two internal interfaces with type aliases to
match repo convention: convert interface AlertableError and interface
ResolvedEnvironment into type AlertableError = { ... } and type
ResolvedEnvironment = { ... }, keeping the exact same property names and types
(classification, error, environmentName) and (id, type, displayName)
respectively; update any local references if needed but do not change the shape
or visibility.

In `@internal-packages/clickhouse/src/index.ts`:
- Around line 248-263: The new errors getter (get errors) that exposes multiple
query-builder helpers (getGroups, getInstances, getHourlyOccurrences,
affectedVersionsQueryBuilder, listQueryBuilder, occurrencesListQueryBuilder,
createOccurrencesQueryBuilder, createOccurrencesByVersionQueryBuilder,
occurrenceCountSinceQueryBuilder, activeErrorsSinceQueryBuilder,
occurrenceCountsSinceQueryBuilder) needs repository-standard tracing crumbs;
wrap the errors block with // `@crumbs` comments (or a // `#region` `@crumbs` ... //
`#endregion` `@crumbs` block) immediately before and after the getter so agentcrumbs
can trace calls to these functions, ensuring the comments surround the get
errors { ... } block and reference the same function names for clarity.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: e7e10a3c-d2f2-429c-85bb-c4d210930768

📥 Commits

Reviewing files that changed from the base of the PR and between cd9b5ea and e609e8b.

📒 Files selected for processing (21)

apps/webapp/app/components/logs/LogsVersionFilter.tsx
apps/webapp/app/components/primitives/charts/ChartRoot.tsx
apps/webapp/app/components/runs/v3/RunFilters.tsx
apps/webapp/app/models/projectAlert.server.ts
apps/webapp/app/presenters/v3/ErrorGroupPresenter.server.ts
apps/webapp/app/presenters/v3/ErrorsListPresenter.server.ts
apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.alerts/route.tsx
apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.errors.$fingerprint/route.tsx
apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.errors._index/route.tsx
apps/webapp/app/v3/alertsWorker.server.ts
apps/webapp/app/v3/services/alerts/createAlertChannel.server.ts
apps/webapp/app/v3/services/alerts/deliverErrorGroupAlert.server.ts
apps/webapp/app/v3/services/alerts/errorAlertEvaluator.server.ts
apps/webapp/app/v3/services/errorGroupActions.server.ts
internal-packages/clickhouse/src/errors.ts
internal-packages/clickhouse/src/index.ts
internal-packages/database/prisma/migrations/20260306102053_error_group_state/migration.sql
internal-packages/database/prisma/migrations/20260308181657_add_error_alert_config_to_project_alert_channel/migration.sql
internal-packages/database/prisma/schema.prisma
internal-packages/emails/emails/alert-error-group.tsx
internal-packages/emails/src/index.tsx

🚧 Files skipped from review as they are similar to previous changes (4)

apps/webapp/app/components/logs/LogsVersionFilter.tsx
apps/webapp/app/components/runs/v3/RunFilters.tsx
apps/webapp/app/components/primitives/charts/ChartRoot.tsx
internal-packages/clickhouse/src/errors.ts

coderabbitai · 2026-03-16T23:46:02Z

apps/webapp/app/presenters/v3/ErrorGroupPresenter.server.ts

+    const [summary, affectedVersions, runList, stateRow] = await Promise.all([
      this.getSummary(organizationId, projectId, environmentId, fingerprint),
      this.getAffectedVersions(organizationId, projectId, environmentId, fingerprint),
      this.getRunList(organizationId, environmentId, {
        userId,
        projectId,
        fingerprint,
+        versions,
        pageSize: runsPageSize,
        from: time.from.getTime(),
        to: time.to.getTime(),
        cursor,
        direction,
      }),
+      this.getState(environmentId, fingerprint),
    ]);


⚠️ Potential issue | 🟠 Major

State lookup needs the task identifier.

ErrorGroupState is keyed by environment + taskIdentifier + fingerprint elsewhere in this PR, but this query only filters by environment and fingerprint. If the same fingerprint exists under multiple tasks, the page can show the wrong resolved/ignored state; fetching summary and stateRow in the same Promise.all also prevents you from disambiguating it.

Also applies to: 318-340

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@apps/webapp/app/presenters/v3/ErrorGroupPresenter.server.ts` around lines 130 - 145, getState is being called without the task identifier so it can return the wrong ErrorGroupState when the same fingerprint exists under multiple tasks; stop fetching summary and state in parallel and instead first await this.getSummary(...) to read the taskIdentifier, then call this.getState(environmentId, taskIdentifier, fingerprint) (or otherwise pass the task identifier into getState) when you run the remaining Promise.all for getAffectedVersions, getRunList and getState so state lookup is disambiguated by taskIdentifier.

coderabbitai · 2026-03-16T23:46:02Z

apps/webapp/app/v3/services/alerts/createAlertChannel.server.ts

+  async #scheduleErrorAlertEvaluation(projectId: string): Promise<void> {
+    await alertsWorker.enqueue({
+      id: `evaluateErrorAlerts:${projectId}`,
+      job: "v3.evaluateErrorAlerts",
+      payload: {
+        projectId,
+        scheduledAt: Date.now(),
+      },
+    });


⚠️ Potential issue | 🟠 Major

Don't reuse the periodic job id for the immediate kick.

This helper uses the same evaluateErrorAlerts:${projectId} id that the evaluator's delayed self-chain uses. If a future run is already queued, this enqueue can become a no-op, so newly enabled or edited ERROR_GROUP channels may not be evaluated until the old availableAt.

Suggested fix

async `#scheduleErrorAlertEvaluation`(projectId: string): Promise<void> { await alertsWorker.enqueue({ - id: `evaluateErrorAlerts:${projectId}`, + id: `evaluateErrorAlerts:manual:${projectId}:${Date.now()}`, job: "v3.evaluateErrorAlerts", payload: { projectId, scheduledAt: Date.now(), }, }); }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

async #scheduleErrorAlertEvaluation(projectId: string): Promise<void> {

await alertsWorker.enqueue({

id: `evaluateErrorAlerts:${projectId}`,

job: "v3.evaluateErrorAlerts",

payload: {

projectId,

scheduledAt: Date.now(),

},

});

async `#scheduleErrorAlertEvaluation`(projectId: string): Promise<void> {

await alertsWorker.enqueue({

id: `evaluateErrorAlerts:manual:${projectId}:${Date.now()}`,

job: "v3.evaluateErrorAlerts",

payload: {

projectId,

scheduledAt: Date.now(),

},

});

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@apps/webapp/app/v3/services/alerts/createAlertChannel.server.ts` around lines 104 - 112, The helper `#scheduleErrorAlertEvaluation` currently enqueues a job using the same id pattern "evaluateErrorAlerts:${projectId}" that's used by the evaluator's delayed self-chain, causing deduplication/no-op if a future run is already queued; change the id for this immediate kick to be unique (for example append a timestamp or a "now" suffix like "evaluateErrorAlerts:${projectId}:now:${Date.now()}") so alertsWorker.enqueue always creates a distinct job when triggered immediately after channel creation/edit; keep the job name "v3.evaluateErrorAlerts" and payload identical, only alter the id generation logic in `#scheduleErrorAlertEvaluation`.

coderabbitai · 2026-03-16T23:46:02Z

apps/webapp/app/v3/services/alerts/deliverErrorGroupAlert.server.ts

+    channel: {
+      type: ProjectAlertChannelType;
+      properties: unknown;
+      project: { id: string; externalRef: string; slug: string; name: string; organizationId: string; organization: { slug: string; title: string } };
+    },


⚠️ Potential issue | 🟠 Major

Redact webhook URLs from logs and thrown errors.

Webhook URLs commonly embed credentials in the path or query string. Logging webhookProperties.data.url and including it in the thrown error will leak those secrets into worker logs and retry metadata.

Suggested fix

async `#sendWebhook`( channel: { + id: string; type: ProjectAlertChannelType; properties: unknown; project: { id: string; externalRef: string; slug: string; name: string; organizationId: string; organization: { slug: string; title: string } }; }, @@ if (!response.ok) { logger.info("[DeliverErrorGroupAlert] Failed to send webhook", { status: response.status, statusText: response.statusText, - url: webhookProperties.data.url, + channelId: channel.id, }); - throw new Error(`Failed to send error group alert webhook to ${webhookProperties.data.url}`); + throw new Error("Failed to send error group alert webhook"); }

Also applies to: 310-316

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@apps/webapp/app/v3/services/alerts/deliverErrorGroupAlert.server.ts` around lines 239 - 243, The code logs and throws the raw webhook URL (webhookProperties.data.url) from the channel object (type ProjectAlertChannelType), which can leak embedded credentials; update deliverErrorGroupAlert.server.ts to sanitize/redact webhookProperties.data.url before any logging or including it in thrown errors (e.g., replace host/path/query credentials with "[REDACTED]" or strip auth/query params) and ensure any thrown Error messages or processLogger entries use the redacted value; apply the same change for the other occurrences around the webhook handling block (also referenced at the second occurrence near lines 310-316) so no raw webhook URL is emitted.

coderabbitai · 2026-03-16T23:46:02Z

apps/webapp/app/v3/services/alerts/errorAlertEvaluator.server.ts

+export class ErrorAlertEvaluator {
+  constructor(
+    protected readonly _prisma: PrismaClientOrTransaction = $replica,
+    protected readonly _replica: PrismaClientOrTransaction = prisma,
+    protected readonly _clickhouse: ClickHouse = clickhouseClient
+  ) {}


⚠️ Potential issue | 🔴 Critical

Swap the primary and replica defaults here.

_prisma currently defaults to $replica and _replica defaults to prisma, so updateErrorGroupStates() writes through the read client while the reads hit the primary. On replica-backed deployments that will either fail outright or leave alert state changes unpersisted.

Suggested fix

constructor( - protected readonly _prisma: PrismaClientOrTransaction = $replica, - protected readonly _replica: PrismaClientOrTransaction = prisma, + protected readonly _prisma: PrismaClientOrTransaction = prisma, + protected readonly _replica: PrismaClientOrTransaction = $replica, protected readonly _clickhouse: ClickHouse = clickhouseClient ) {}

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@apps/webapp/app/v3/services/alerts/errorAlertEvaluator.server.ts` around lines 30 - 35, The constructor for ErrorAlertEvaluator has its _prisma and _replica defaults reversed causing writes (e.g., updateErrorGroupStates) to go to the read-only client; swap the defaults so _prisma defaults to prisma and _replica defaults to $replica. Update the constructor parameter order/assignments for ErrorAlertEvaluator accordingly so any write operations use the primary Prisma client (_prisma = prisma) and reads use the replica (_replica = $replica).

coderabbitai · 2026-03-16T23:46:03Z

apps/webapp/app/v3/services/alerts/errorAlertEvaluator.server.ts

+    for (const alertable of alertableErrors) {
+      const envChannels = channelsByEnvId.get(alertable.error.environment_id) ?? [];
+      for (const channel of envChannels) {
+        await alertsWorker.enqueue({
+          job: "v3.deliverErrorGroupAlert",
+          payload: {
+            channelId: channel.id,
+            projectId,
+            classification: alertable.classification,
+            error: {
+              fingerprint: alertable.error.error_fingerprint,
+              environmentId: alertable.error.environment_id,
+              environmentName: alertable.environmentName,
+              taskIdentifier: alertable.error.task_identifier,
+              errorType: alertable.error.error_type,
+              errorMessage: alertable.error.error_message,
+              sampleStackTrace: alertable.error.sample_stack_trace,
+              firstSeen: alertable.error.first_seen,
+              lastSeen: alertable.error.last_seen,
+              occurrenceCount: alertable.error.occurrence_count,
+            },
+          },
+        });
+      }


⚠️ Potential issue | 🟠 Major

Longer channel intervals are effectively ignored.

The evaluator schedules the whole project at the shortest configured interval, but the delivery loop still enqueues alerts for every matching channel on every run. That means a channel configured for 60 minutes will still receive alerts every 5 minutes if any sibling channel uses a 5-minute interval.

Also applies to: 212-221, 391-404

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@apps/webapp/app/v3/services/alerts/errorAlertEvaluator.server.ts` around lines 102 - 125, The loop enqueues alerts to every envChannels entry on every evaluator run, ignoring per-channel intervals; inside the for (const channel of envChannels) loop (before calling alertsWorker.enqueue) add a guard that respects the channel's configured interval by comparing it to the evaluator's current run interval (e.g., currentRunIntervalMinutes or evaluationIntervalMinutes): only enqueue when the current run aligns with the channel interval (for example channel.intervalMinutes === currentRunIntervalMinutes or channel.intervalMinutes % currentRunIntervalMinutes === 0), otherwise continue; update any related loops mentioned (the other occurrences around the same pattern at the referenced ranges) to use the same interval-check guard.

coderabbitai · 2026-03-16T23:46:03Z

apps/webapp/app/v3/services/alerts/errorAlertEvaluator.server.ts

+    if (
+      state.ignoredUntilOccurrenceRate !== null &&
+      state.ignoredUntilOccurrenceRate !== undefined
+    ) {
+      const windowMinutes = Math.max(context.windowMs / 60_000, 1);
+      const rate = context.occurrencesSince / windowMinutes;
+      if (rate > state.ignoredUntilOccurrenceRate) {
+        return true;
+      }
+    }
+
+    if (
+      state.ignoredUntilTotalOccurrences !== null &&
+      state.ignoredUntilTotalOccurrences !== undefined &&
+      state.ignoredAt
+    ) {
+      if (context.occurrencesSince >= state.ignoredUntilTotalOccurrences) {
+        return true;
+      }
+    }


⚠️ Potential issue | 🟠 Major

ignoredUntilTotalOccurrences is only checking the latest window.

getOccurrenceCountsSince() always counts from the current scheduledAt, so a total-occurrence threshold only trips if the entire threshold is exceeded in one evaluation cycle. A rule like "unignore after 100 occurrences" will never fire if the error accrues 20 occurrences across five runs.

Also applies to: 317-343

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@apps/webapp/app/v3/services/alerts/errorAlertEvaluator.server.ts` around lines 178 - 197, The total-occurrence unignore check uses context.occurrencesSince (window-only) so a threshold like ignoredUntilTotalOccurrences won’t fire across multiple evaluation windows; change the logic to compute the cumulative occurrences since the alert was ignored and compare that to state.ignoredUntilTotalOccurrences. Specifically, replace the context.occurrencesSince check with a call that gets counts since state.ignoredAt (or otherwise compute occurrences since state.ignoredAt using getOccurrenceCountsSince/your occurrence-count helper) and use that cumulative count in the if that references state.ignoredUntilTotalOccurrences and state.ignoredAt.

coderabbitai · 2026-03-16T23:46:03Z

apps/webapp/app/v3/services/alerts/errorAlertEvaluator.server.ts

+      await this,
+        this._prisma.errorGroupState.update({
+          where: { id: state.id },
+          data: {
+            status: "UNRESOLVED",
+            ignoredUntil: null,
+            ignoredUntilOccurrenceRate: null,
+            ignoredUntilTotalOccurrences: null,
+            ignoredAt: null,
+            ignoredReason: null,
+            ignoredByUserId: null,
+            resolvedAt: null,
+            resolvedInVersion: null,
+            resolvedBy: null,
+          },
+        });


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

cat -n apps/webapp/app/v3/services/alerts/errorAlertEvaluator.server.ts | sed -n '365,395p'

Repository: triggerdotdev/trigger.dev

Length of output: 1221

🏁 Script executed:

cat -n apps/webapp/app/v3/services/alerts/errorAlertEvaluator.server.ts | sed -n '360,400p'

Repository: triggerdotdev/trigger.dev

Length of output: 1543

Remove the errant await this, comma expression that prevents the state update from being awaited.

Line 372 has await this, which is a comma operator that discards the awaited this reference and returns the unawaited Prisma promise. The update will execute but won't be awaited, allowing the method to return and subsequent operations (like selfChain()) to run before the database write commits. This creates a race condition where state changes are invisible to the job's continuation.

Fix

- await this, - this._prisma.errorGroupState.update({ + await this._prisma.errorGroupState.update({ where: { id: state.id }, data: { status: "UNRESOLVED", ignoredUntil: null, ignoredUntilOccurrenceRate: null, ignoredUntilTotalOccurrences: null, ignoredAt: null, ignoredReason: null, ignoredByUserId: null, resolvedAt: null, resolvedInVersion: null, resolvedBy: null, }, });

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

await this,

this._prisma.errorGroupState.update({

where: { id: state.id },

data: {

status: "UNRESOLVED",

ignoredUntil: null,

ignoredUntilOccurrenceRate: null,

ignoredUntilTotalOccurrences: null,

ignoredAt: null,

ignoredReason: null,

ignoredByUserId: null,

resolvedAt: null,

resolvedInVersion: null,

resolvedBy: null,

},

});

await this._prisma.errorGroupState.update({

where: { id: state.id },

data: {

status: "UNRESOLVED",

ignoredUntil: null,

ignoredUntilOccurrenceRate: null,

ignoredUntilTotalOccurrences: null,

ignoredAt: null,

ignoredReason: null,

ignoredByUserId: null,

resolvedAt: null,

resolvedInVersion: null,

resolvedBy: null,

},

});

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@apps/webapp/app/v3/services/alerts/errorAlertEvaluator.server.ts` around lines 372 - 387, Remove the stray comma-expression "await this, ..." so the Prisma update is actually awaited: locate the call to this._prisma.errorGroupState.update (the block that sets status: "UNRESOLVED" and clears ignored/resolved fields) and change the statement to await the update promise directly (i.e., await this._prisma.errorGroupState.update(...)) so the database write completes before continuing (e.g., before calling selfChain()).

coderabbitai · 2026-03-16T23:46:03Z

internal-packages/emails/emails/alert-error-group.tsx

+  project: "my-project",
+};
+
+export default function Email(props: AlertErrorGroupEmailProps) {


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash set -euo pipefail # Verify default export removal in the new email file rg -nP --type=tsx '^\s*export\s+default\b' internal-packages/emails/emails/alert-error-group.tsx # Verify named import usage in EmailClient index rg -nP --type=tsx 'alert-error-group' internal-packages/emails/src/index.tsx -A3 -B3

Repository: triggerdotdev/trigger.dev

Length of output: 96

🏁 Script executed:

# Search for default export in alert-error-group file rg -n 'export\s+default' internal-packages/emails/emails/alert-error-group.tsx # Check the import statement in index.tsx rg -n 'alert-error-group' internal-packages/emails/src/index.tsx -A5 -B2 # Get file line counts to know safe read ranges wc -l internal-packages/emails/emails/alert-error-group.tsx internal-packages/emails/src/index.tsx

Repository: triggerdotdev/trigger.dev

Length of output: 906

🏁 Script executed:

# Read the alert-error-group file around the export line sed -n '55,65p' internal-packages/emails/emails/alert-error-group.tsx # Check if this export is imported/used anywhere else rg -n 'AlertErrorGroupEmail' internal-packages/emails/ --type=ts --type=tsx # Check the export in index.tsx around the import sed -n '1,20p' internal-packages/emails/src/index.tsx

Repository: triggerdotdev/trigger.dev

Length of output: 1261

🏁 Script executed:

# Check all usages of AlertErrorGroupEmail across the codebase rg -n 'AlertErrorGroupEmail' --max-count=20 # Also check if there are any other references to alert-error-group file rg -n 'alert-error-group' internal-packages/ --max-count=20

Repository: triggerdotdev/trigger.dev

Length of output: 1203

Replace the default export with a named function export.

Line 58 introduces a default export, which violates the repository export rule. Switch to a named export and update the import in internal-packages/emails/src/index.tsx accordingly.

Suggested patch

- export default function Email(props: AlertErrorGroupEmailProps) { + export function AlertErrorGroupEmail(props: AlertErrorGroupEmailProps) { const { classification, taskIdentifier, environment, error, occurrenceCount, errorLink,

-import AlertErrorGroupEmail, { - AlertErrorGroupEmailSchema, -} from "../emails/alert-error-group"; +import { + AlertErrorGroupEmail, + AlertErrorGroupEmailSchema, +} from "../emails/alert-error-group";

Per coding guidelines: **/*.{ts,tsx,js,jsx}: Use function declarations instead of default exports

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

export default function Email(props: AlertErrorGroupEmailProps) {

export function AlertErrorGroupEmail(props: AlertErrorGroupEmailProps) {

const {

classification,

taskIdentifier,

environment,

error,

occurrenceCount,

errorLink,

Suggested change

export default function Email(props: AlertErrorGroupEmailProps) {

import {

AlertErrorGroupEmail,

AlertErrorGroupEmailSchema,

} from "../emails/alert-error-group";

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@internal-packages/emails/emails/alert-error-group.tsx` at line 58, Change the default export of the component function Email to a named export (use "export function Email(...)" instead of "export default function Email(...)") and then update the corresponding import in the emails index module to use a named import (e.g. import { Email } from '...') or rename on import if needed; also update any usages that relied on the default import to reference the named export.

matt-aitken marked this pull request as draft March 5, 2026 23:24

devin-ai-integration bot reviewed Mar 5, 2026

View reviewed changes

ericallam approved these changes Mar 6, 2026

View reviewed changes

matt-aitken added 7 commits March 16, 2026 22:12

Add versions filtering to the Errors list page

b6ead44

Added versions filter to the fingerprint page

307f968

Stacked bars for the versions

1b05a0b

Removed legend

5e44b5f

Error alerting

82a5ea0

Use SegmentedControl for error filtering

de3a49f

Switch to a status dropdown

e609e8b

matt-aitken force-pushed the errors-add-version-filter branch from cd9b5ea to e609e8b Compare March 16, 2026 23:30

coderabbitai bot reviewed Mar 16, 2026

View reviewed changes

-export default function Email(props: AlertErrorGroupEmailProps) {
+export function AlertErrorGroupEmail(props: AlertErrorGroupEmailProps) {
+  const {
+    classification,
+    taskIdentifier,
+    environment,
+    error,
+    occurrenceCount,
+    errorLink,

-export default function Email(props: AlertErrorGroupEmailProps) {
+import {
+  AlertErrorGroupEmail,
+  AlertErrorGroupEmailSchema,
+} from "../emails/alert-error-group";

Uh oh!

Conversation

matt-aitken commented Mar 5, 2026

Uh oh!

changeset-bot bot commented Mar 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

⚠️ No Changeset found

Uh oh!

coderabbitai bot commented Mar 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Estimated code review effort

❌ Failed checks (2 warnings)

Uh oh!

devin-ai-integration bot left a comment

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration bot Mar 5, 2026

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration bot Mar 5, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 16, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 16, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 16, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 16, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 16, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 16, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 16, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 16, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

changeset-bot bot commented Mar 5, 2026 •

edited

Loading

coderabbitai bot commented Mar 5, 2026 •

edited

Loading